Causal Association of Golgi Protein 73 With Coronary Artery Disease: Evidence from Proteomics and Mendelian Randomization

Background: Identifying unknown pathogenic factors that drive atherosclerosis not only advances the development of disease biomarkers but also facilitates the discovery of new therapeutic targets, thereby contributing to improved management of coronary artery disease (CAD). We aimed to identify causative protein biomarkers in CAD etiology using proteomics and a 2-sample Mendelian randomization (MR) design. Methods: Serum samples from 33 first-onset CAD patients and 31 non-CAD controls were collected and profiled using a protein array. Differential expression analyses were used to identify candidate proteins for causal inference. We used 2-sample MR to detect causal associations between the candidate proteins and CAD. Network MR was performed to explore whether metabolic risk factors for CAD mediated the risk conferred by the identified protein. Vascular expression of the candidate protein in situ was also examined. Results: Among the differentially expressed proteins identified by proteomics, circulating Golgi protein 73 (GP73) was causally associated with incident CAD and with other atherosclerotic events sharing a similar etiology. The network MR approach showed that low-density lipoprotein cholesterol and glycated hemoglobin serve as mediators in the causal pathway, transmitting 42.1% and 8.7% of the effect of GP73 on CAD, respectively. Beyond the circulating form of GP73, both the mouse model and human specimens indicate that vascular GP73 expression was also upregulated in atherosclerotic lesions, concomitant with markers of macrophages and of phenotypic switching of vascular smooth muscle cells (VSMCs). Conclusions: Our study supports GP73 as a biomarker and causal factor for CAD. GP73 may be involved in CAD pathogenesis mainly via dyslipidemia and hyperglycemia, which may enrich the etiological information and suggest future research directions for CAD.


The genome-wide association studies (GWAS) in the Mendelian randomization study
Table S2 Genetic instruments in AGES study used in Mendelian randomization analysis
Table S3 The associations between genetic variant with outcome
Table S4 Description of GSE100927 and GSE28829
Table S5 Differentially expressed proteins between the CAD group and control group (P-value < 0.05 and log2FC > 0.263 or log2FC < -0.263)
Table S6 The causal association between candidate protein with CAD in Mendelian randomization
Table S7 Replication of causal association between GP73 level with coronary artery disease using CAD GWAS in FINNGEN study
Table S8 Causal associations between GP73 level and atherosclerosis disease
Table S9 Discovery and replication of causal association between GP73 and metabolic risk factors
Table S10 Gene Ontology (GO) - Biological Process for GOLM1 Gene

atherosclerosis outcomes sharing similar etiology (myocardial infarction, stroke and PAD) were performed; (5) Further validating the robustness of the causality using other optional MR methods and external replication in other GWAS; (6) Performing network Mendelian randomization to explore the role of metabolic risk factors of CAD in the causal pathway from identified causal agents to CAD; (7) Investigating the association between vascular expression of the identified protein in situ with atherosclerosis using both mouse and human samples.

Figure S2 Framework of network Mendelian randomization 1
Step 1: Assess the causality between the candidate protein and CAD; Step 2: Assess the causality between the candidate protein and metabolic risk factors; Step 3: Assess the causality between metabolic risk factors and CAD. Only when causality exists in all three steps is the metabolic risk factor considered a mediator.
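The three-step rule above can be sketched in a few lines of Python. This is only an illustration: the `MREstimate` container, the function names, and the significance threshold of 0.05 are assumptions made for the example and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class MREstimate:
    """Hypothetical container for one MR result (not the study's data format)."""
    beta: float     # causal effect estimate, e.g. a log odds ratio
    p_value: float  # P-value of the MR estimate


def is_causal(estimate: MREstimate, alpha: float = 0.05) -> bool:
    """Treat an estimate as evidence of causality if P < alpha (assumed threshold)."""
    return estimate.p_value < alpha


def is_mediator(protein_to_cad: MREstimate,
                protein_to_factor: MREstimate,
                factor_to_cad: MREstimate,
                alpha: float = 0.05) -> bool:
    """A risk factor counts as a mediator only if all three steps show causality."""
    steps = (protein_to_cad, protein_to_factor, factor_to_cad)
    return all(is_causal(step, alpha) for step in steps)


# Made-up numbers, for illustration only.
step1 = MREstimate(beta=0.10, p_value=0.01)   # protein -> CAD
step2 = MREstimate(beta=0.20, p_value=0.001)  # protein -> risk factor
step3 = MREstimate(beta=0.30, p_value=0.02)   # risk factor -> CAD
print(is_mediator(step1, step2, step3))
```

If any one of the three estimates fails the threshold, the function returns `False`, mirroring the requirement that causality must hold in all three steps.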

Figure S3 Funnel plot for MR analysis of causal effect of GP73 on CAD
Outcome name: CAD, coronary artery disease; PAD, peripheral artery disease. GWAS names: CARDIoGRAMplusC4D, Coronary ARtery DIsease Genome-wide Replication and Meta-analysis (CARDIoGRAM) plus The Coronary Artery Disease (C4D) Genetics; GLGC, Global Lipids Genetics Consortium; MAGIC, Meta-Analyses of Glucose and Insulin-related traits Consortium. a Inverse variance weighted (random-effect) method; b Inverse variance weighted (fixed-effect) method.

Information on gender, age, and smoking status was self-reported. After a 15-minute rest in a sitting position, systolic blood pressure (SBP) and diastolic blood pressure (DBP) were measured three times at 5-minute intervals on the right arm using a mercury sphygmomanometer, and the mean values were used for analysis. Standing height and weight were measured in a standardized posture with the subjects wearing light clothes and no shoes. Body mass index (BMI) was calculated as weight (in kilograms) divided by squared height (in meters). Hypertension (HTN) was ascertained if a participant had systolic blood pressure ≥ 140 mmHg or diastolic blood pressure ≥ 90 mmHg, a prior history of hypertension, or was taking antihypertensive medication 10. We defined diabetes mellitus (DM) as fasting glucose ≥ 126 mg/dL, non-fasting glucose ≥ 200 mg/dL, A1C ≥ 6.5%, self-report of a previous diabetes diagnosis, or taking medication for diabetes 11.

Total cholesterol, low-density lipoprotein cholesterol, and triglycerides were measured by the automatic enzyme method. Glycated hemoglobin (HbA1c) was tested via an automated immunochemistry method. Creatine kinase-MB (CK-MB) and cardiac troponin T (cTNT) were measured using immunoassays.
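The clinical definitions above translate directly into simple rules. The following is a minimal sketch, assuming hypothetical function and parameter names; only the thresholds (140/90 mmHg, 126 and 200 mg/dL, 6.5%) come from the text.

```python
from typing import Optional, Sequence


def mean_reading(readings: Sequence[float]) -> float:
    """Mean of repeated blood pressure readings (three in the text)."""
    return sum(readings) / len(readings)


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight (kg) divided by squared height (m)."""
    return weight_kg / height_m ** 2


def has_hypertension(sbp: float, dbp: float,
                     prior_history: bool = False,
                     on_medication: bool = False) -> bool:
    """SBP >= 140 mmHg, DBP >= 90 mmHg, prior history, or antihypertensive use."""
    return sbp >= 140 or dbp >= 90 or prior_history or on_medication


def has_diabetes(fasting_glucose_mgdl: Optional[float] = None,
                 nonfasting_glucose_mgdl: Optional[float] = None,
                 a1c_percent: Optional[float] = None,
                 self_reported: bool = False,
                 on_medication: bool = False) -> bool:
    """Any single criterion from the text is sufficient; missing values are skipped."""
    return (
        (fasting_glucose_mgdl is not None and fasting_glucose_mgdl >= 126)
        or (nonfasting_glucose_mgdl is not None and nonfasting_glucose_mgdl >= 200)
        or (a1c_percent is not None and a1c_percent >= 6.5)
        or self_reported
        or on_medication
    )


# Made-up participant, for illustration only.
sbp = mean_reading([138, 142, 146])
dbp = mean_reading([84, 86, 88])
print(round(bmi(70, 1.75), 1), has_hypertension(sbp, dbp), has_diabetes(a1c_percent=6.1))
```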

Text S2 Protein arrays protocol
A. Completely Air Dry The Glass Slide
1. Take out the glass slide from the box, and let it equilibrate to room temperature inside the sealed plastic bag for 20-30 minutes. Remove the slide from the plastic bag, peel off the cover film, and let it air dry for another 1-2 hours.
Incomplete drying of slides before use may cause the formation of "comet tails," thin directional smearing of antibody spots.

B. Blocking & Incubation
2. Add 100 µl Sample Diluent into each well and incubate at room temperature for 30 minutes to block slides.
3. Decant buffer from each well. Add 100 µl of sample to each well. Incubate arrays at room temperature for 1-2 hours.
Longer incubation time is preferable for higher signals. This step may be done overnight at 4°C.
We recommend using 50 to 100 µl of original or diluted serum, plasma, conditioned media, or other body fluid, or 50-500 µg/ml of protein for cell and tissue lysates. Cover the incubation chamber with adhesive film during incubation, especially if less than 70 µl of sample or reagent is used.

Wash:
• Decant the samples from each well, and wash 5 times (5 min each) with 150 µl of 1X Wash Buffer I at room temperature with gentle shaking. Completely remove wash buffer in each wash step. Dilute 20X Wash Buffer I with H2O.
• (Optional for Cell and Tissue Lysates) Put the glass slide with frame into a box with 1X Wash Buffer I (cover the whole glass slide and frame with Wash Buffer I), and wash at room temperature with gentle shaking for 20 min.
• Decant the 1X Wash Buffer I from each well, and wash 2 times (5 min each) with 150 µl of 1X Wash Buffer II at room temperature with gentle shaking.
Completely remove wash buffer in each wash step. Dilute 20X Wash Buffer II with H2O.
Incomplete removal of the wash buffer in each wash step may cause "dark spots," where the background signal is higher than that of the spots.

C. Incubation with Biotinylated Antibody Cocktail & Wash
5. Reconstitute the detection antibody by adding 1.4 ml of Sample Diluent to the tube. Spin briefly.
6. Add 80 µl of the detection antibody cocktail to each well. Incubate at room temperature for 1-2 hours.
Longer incubation time is preferable for higher signals.
7. Decant the samples from each well, and wash 5 times (5 min each) with 150 µl of 1X Wash Buffer I and then 2 times with 150 µl of 1X Wash Buffer II at room temperature with gentle shaking.
Completely remove wash buffer in each wash step.

D. Incubation with Cy3 Equivalent
12. Place the slide in the Slide Washer/Dryer (a 4-slide holder/centrifuge tube), add enough 1X Wash Buffer I (about 30 ml) to cover the whole slide, and then gently shake at room temperature for 15 minutes. Decant Wash Buffer I. Wash with 1X Wash Buffer II (about 30 ml) and gently shake at room temperature for 5 minutes.
13. Remove water droplets completely by gently applying suction with a pipette. Do not touch the array, only the sides.
14. Imaging: The signals can be visualized using a laser scanner equipped with a Cy3 wavelength (green channel), such as Axon GenePix or Innopsys Innoscan.

F. Data Analysis
15. Data extraction can be done using the GAL file that is specific for this array along with the microarray analysis software (GenePix, ScanArray Express, ArrayVision, MicroVigene, etc.).
Describe statistical methods and statistics used. Methods - Two-sample Mendelian Randomization - Paragraph 1-2; Table S1.
d) If applicable, say how multiple testing was dealt with.

Assessment of assumptions
Describe any methods used to assess the assumptions or justify their validity.

Sensitivity analyses
Describe any sensitivity analyses or additional analyses performed.
Methods - Two-sample Mendelian Randomization - Paragraph 2-3.
b) State whether the study protocol and details were pre-registered (as well as when and where). N/A.

Descriptive data
For two-sample Mendelian randomization: Provide information on extent of sample overlap between the exposure and outcome data sources.
Methods - Publicly available GWAS summary data for 2-sample MR analyses - Paragraph 1.

Main results
a) Report the associations between genetic variant and exposure, and between genetic variant and outcome, preferably on an interpretable scale. Results - Figure 2; Figure S3.

Assessment of assumptions
a) Assess the validity of the assumptions.
Results - Table 2; Figure S3.
b) Report any additional statistics (e.g., assessments of heterogeneity, such as I², Q statistic).
Results - Paragraph 3.
b) Discuss underlying biological mechanisms that could be modelled by using the genetic variants to assess the relationship between the exposure and the outcome.

Generalizability
Discuss the generalizability of the study results (a) to other populations (i.e., external validity), (b) across other exposure periods/timings, and (c) across other levels of exposure.
Discussion - Strengths and limitations.

Text S4 Concept of mediation analysis
Mediation analysis determines whether potential mediators mediate the association between an independent variable and a dependent variable, and quantifies their contribution 113. Several criteria should be satisfied in mediation analysis: 1) the independent variable (X) must be significantly associated with the dependent variable (Y) (the effect of X on Y is c); 2) X must be significantly associated with the mediator (M) (the effect of X on M is a); 3) M must be significantly associated with Y (the effect of M on Y is b). c' is the effect of X on Y bypassing the investigated mediator, calculated by adjusting for the investigated mediator in the model. Based on the design of mediation analysis (as shown in the following Figure), the total effect of X on Y (c) is divided into a direct effect (c', the effect not transmitted by the selected mediators) and an indirect effect (a*b, the effect mediated by the investigated mediator), so that c = a*b + c'. The proportion mediated (PM, %) is defined as indirect effect / total effect (PM = a*b / c).
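As a worked illustration of the decomposition c = a*b + c' and of PM = a*b / c, the short Python sketch below computes the indirect effect, the direct effect, and the proportion mediated. The function name and the input values are hypothetical and chosen only for the arithmetic; they are not estimates from this study.

```python
from typing import Dict


def decompose_effect(a: float, b: float, c: float) -> Dict[str, float]:
    """Split a total effect c into indirect (a*b) and direct (c') components.

    a: effect of X on the mediator M
    b: effect of the mediator M on Y
    c: total effect of X on Y (must be non-zero)
    """
    if c == 0:
        raise ValueError("total effect c must be non-zero to compute PM")
    indirect = a * b          # effect transmitted through the mediator
    direct = c - indirect     # c' = c - a*b
    proportion_mediated = 100.0 * indirect / c  # PM in percent
    return {
        "indirect": indirect,
        "direct": direct,
        "proportion_mediated_pct": proportion_mediated,
    }


# Made-up effects, for illustration only.
result = decompose_effect(a=0.5, b=0.4, c=0.8)
print(result)
```

With these illustrative inputs the indirect effect is 0.2 out of a total of 0.8, so one quarter of the total effect passes through the mediator.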

Figure Frame of mediation analysis
In the present study, we used mediation analysis to assess the extent to which the association of GP73

Text S5 Information on experimental procedures
All experimental procedures involving animals were performed according to the Guide for the Care and Use of Laboratory Animals (NIH Publication, 8th edition, 2011) and were approved by the ethics committee of Sun Yat-sen University.
ApoE−/− mice (8 weeks old; male; Vital River, Beijing, China) were housed in specific pathogen-free conditions and randomly allocated to either a high-fat diet group (15% lard, 20% sugar, and 1.2% cholesterol; n = 5) or a normal diet group (n = 5) for 12 weeks. The high-fat diet was used to induce atherosclerosis, and the normal diet group served as the control group. After modeling was completed, mice were weighed and then anesthetized by intraperitoneal injection of 1% pentobarbital sodium. Regular vital signs (respiratory rate and pulse) together with absence of the toe-pinch reflex indicated adequate anesthesia. Blood samples were collected, and serum specimens were extracted from whole blood and stored at -80℃ until measurement. Aortic tissues were isolated for Western blotting and immunofluorescence staining. The aortic roots were embedded in optimum cutting temperature (OCT) compound and sequentially cut into 8 µm-thick sections with a cryostat. Enzymatic kits (Jiancheng Biotechnology, Nanjing, China) were used to measure plasma total cholesterol, low-density lipoprotein cholesterol and high-density lipoprotein cholesterol.

Figure S1 Flowchart of study design

Dye-Streptavidin & Wash
8. After briefly spinning down, add 1.4 ml of Sample Diluent to the Cy3 equivalent dye-conjugated streptavidin tube. Mix gently.
9. Add 80 µl of Cy3 equivalent dye-conjugated streptavidin to each well. Cover the device with aluminum foil to avoid exposure to light, or incubate in a dark room. Incubate at room temperature for 1 hour.
10. Decant the samples from each well, and wash 5 times (5 min each) with 150 µl of 1X Wash Buffer I at room temperature with gentle shaking. Completely remove wash buffer in each wash step.

E. Fluorescence Detection
11. Disassemble the device by pushing the clips outward from the slide side. Carefully remove the slide from the gasket.

a) Describe how quantitative variables were handled in the analyses (i.e., scale, units, model). N/A. (Two-sample design)
b) Describe the process for identifying genetic variants and weights to be included in the analyses (i.e., independence and model). Methods - Selection of genetic instruments - Paragraph 1.
c) Describe the MR estimator, e.g. two-stage least squares, Wald ratio, and related statistics. Detail the included covariates and, in case of two-sample MR, whether the same covariate set was used for adjustment in the two samples.

9. Software and pre-registration
a) Name statistical software and package(s), including version and settings used. Methods - Statistical Analysis - Paragraph 1.

a) Use sensitivity analyses to assess the robustness of the main results to violations of the assumptions. Results - Paragraph 3.
b) Report results from other sensitivity analyses (e.g., replication study with different dataset, analyses of subgroups, validation of instrument(s), simulations, etc.). Results - Paragraph 4-5 (External replication, secondary outcome with similar etiology).
c) Report any assessment of direction of causality (e.g., bidirectional MR).
Discuss limitations of the study, taking into account the validity of the MR assumptions, other sources of potential bias, and imprecision. Discuss both direction and magnitude of any potential bias, and any efforts to address them. Discussion - Strengths and Limitations - Paragraph 2.
16. Interpretation
a) Give a cautious overall interpretation of results considering objectives and limitations. Compare with results from other relevant studies. Discussion - Paragraph 1-2.
whether the results have clinical or policy relevance, and whether interventions could have the same size effect. Discussion - Paragraph 2-4.

Table S6 The causal association between candidate protein with CAD in Mendelian randomization
Columns: UniprotID; Gene symbol; Protein full name; P heterogeneity
a Inverse variance weighted (random-effect) method; b Inverse variance weighted (fixed-effect) method; c Wald ratio method
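For readers unfamiliar with the estimators named in these footnotes, the sketch below shows the textbook summary-data forms of the Wald ratio, the fixed-effect inverse variance weighted (IVW) estimate, Cochran's Q, and a multiplicative random-effects IVW. It is not the study's analysis code: the function names and input numbers are hypothetical, the Wald ratio standard error uses a first-order approximation, and the random-effects variant shown is one common formulation that the study may or may not have used.

```python
import math
from typing import List, Sequence, Tuple


def wald_ratio(beta_exp: float, beta_out: float, se_out: float) -> Tuple[float, float]:
    """Single-variant estimate: outcome effect divided by exposure effect.

    The standard error is a first-order approximation that ignores
    uncertainty in the variant-exposure association.
    """
    return beta_out / beta_exp, se_out / abs(beta_exp)


def _ratios_and_weights(bx: Sequence[float], by: Sequence[float],
                        sy: Sequence[float]) -> Tuple[List[float], List[float]]:
    ratios = [o / e for e, o in zip(bx, by)]          # per-variant Wald ratios
    weights = [(e / s) ** 2 for e, s in zip(bx, sy)]  # inverse-variance weights
    return ratios, weights


def ivw_fixed(bx: Sequence[float], by: Sequence[float],
              sy: Sequence[float]) -> Tuple[float, float]:
    """Fixed-effect IVW: weighted mean of the per-variant ratio estimates."""
    ratios, weights = _ratios_and_weights(bx, by, sy)
    total = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total
    return estimate, math.sqrt(1.0 / total)


def cochran_q(bx: Sequence[float], by: Sequence[float], sy: Sequence[float]) -> float:
    """Heterogeneity of per-variant ratios around the IVW estimate."""
    ratios, weights = _ratios_and_weights(bx, by, sy)
    estimate, _ = ivw_fixed(bx, by, sy)
    return sum(w * (r - estimate) ** 2 for w, r in zip(weights, ratios))


def ivw_random(bx: Sequence[float], by: Sequence[float],
               sy: Sequence[float]) -> Tuple[float, float]:
    """Multiplicative random-effects IVW (assumed variant): same point estimate,
    standard error inflated by sqrt(Q / (n - 1)) when that factor exceeds 1."""
    estimate, se_fixed = ivw_fixed(bx, by, sy)
    n = len(bx)
    if n < 2:
        return estimate, se_fixed
    phi = cochran_q(bx, by, sy) / (n - 1)
    return estimate, se_fixed * math.sqrt(max(1.0, phi))


# Made-up summary statistics for three variants, for illustration only.
bx = [0.10, 0.20, 0.30]   # variant-exposure effects
by = [0.02, 0.10, 0.30]   # variant-outcome effects
sy = [0.01, 0.02, 0.03]   # standard errors of the variant-outcome effects
print(ivw_fixed(bx, by, sy), ivw_random(bx, by, sy), cochran_q(bx, by, sy))
```

The fixed-effect and random-effects versions share a point estimate and differ only in the standard error, which is why tables of this kind typically flag which of the two was reported for each protein.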

Table S7 Replication of causal association between GP73 level with coronary artery disease using CAD GWAS in FINNGEN study
a Inverse variance weighted (fixed-effect) method

Table S8 Causal associations between GP73 level and atherosclerosis disease
b Inverse variance weighted (fixed-effect) method.

Introduction to the RED-CARPET study and data collection method
REal-world Data of CARdiometabolic ProtEcTion (RED-CARPET) is a single-center, ambispective cohort study that aims to identify risk factors associated with metabolic cardiovascular diseases and to explore their relationship with long-term cardiovascular endpoints (registration number: ChiCTR2000039901). This is achieved through long-term follow-up of patients with metabolic cardiovascular diseases in a real-world setting. Patients admitted to the Cardiology department of the First Affiliated Hospital of Sun Yat-sen University with metabolic cardiovascular diseases between 2003 and 2033 were consecutively enrolled. The registry includes patients with any of the following conditions: coronary heart disease, hypertension, heart failure, stroke, diabetes, obesity, dyslipidemia, or hyperuricemia.
(e.g., comparing 25th and 75th percentile of allele count or genetic risk score, if individual-level data available).